Dependence versus Conditional Dependence in Local Causal Discovery from Gene Expression Data

Authors

  • Eric V. Strobl
  • Shyam Visweswaran
Abstract

Motivation: Algorithms that discover variables which are causally related to a target may inform the design of experiments. With observational gene expression data, many methods discover causal variables by measuring each variable’s degree of statistical dependence with the target using dependence measures (DMs). Other methods instead measure each variable’s ability to explain the statistical dependence between the target and the remaining variables in the data using conditional dependence measures (CDMs), since this strategy is guaranteed to find the target’s direct causes, direct effects, and direct causes of the direct effects in the infinite sample limit. In this paper, we design a new algorithm to systematically compare the relative abilities of DMs and CDMs in discovering causal variables from gene expression data.

Results: The proposed algorithm using a CDM is sample efficient, since it consistently outperforms other state-of-the-art local causal discovery algorithms when sample sizes are small. However, the proposed algorithm using a CDM outperforms the proposed algorithm using a DM only when sample sizes are above several hundred. These results suggest that accurate causal discovery from gene expression data using current CDM-based algorithms requires datasets with at least several hundred samples.

Availability: The proposed algorithm is freely available at https://github.com/ericstrobl/DvCD.
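The distinction between a DM and a CDM can be illustrated with a small simulation. The sketch below is not the authors' algorithm and does not use their measures: it assumes a hypothetical linear-Gaussian chain a -> b -> t, uses absolute Pearson correlation as a stand-in DM, and uses absolute partial correlation as a stand-in CDM. The variable names, sample size, and helper functions are all illustrative assumptions.

```python
import numpy as np


def pearson_dm(x, y):
    """Stand-in dependence measure (DM): absolute Pearson correlation."""
    return abs(np.corrcoef(x, y)[0, 1])


def partial_corr_cdm(x, y, z):
    """Stand-in conditional dependence measure (CDM):
    absolute partial correlation of x and y given z,
    computed by correlating least-squares residuals."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return abs(np.corrcoef(res_x, res_y)[0, 1])


# Hypothetical chain: a is an indirect cause, b a direct cause of target t.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = a + rng.normal(size=n)
t = b + rng.normal(size=n)

dm_a = pearson_dm(a, t)            # a looks relevant marginally
cdm_a = partial_corr_cdm(a, t, b)  # but b explains that dependence away
dm_b = pearson_dm(b, t)
cdm_b = partial_corr_cdm(b, t, a)  # b stays dependent on t given a

print(f"DM(a, t) = {dm_a:.3f}, CDM(a, t | b) = {cdm_a:.3f}")
print(f"DM(b, t) = {dm_b:.3f}, CDM(b, t | a) = {cdm_b:.3f}")
```

In this toy setting the DM ranks both a and b as related to t, while the CDM separates the direct cause b from the indirect cause a. The estimate of the conditional quantity is noisier at small n, which is consistent in spirit with the sample-size trade-off the abstract reports.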

Similar resources

Learning causal network structure from multiple (in)dependence models

We tackle the problem of how to use information from multiple (in)dependence models, representing results from different experiments, including background knowledge, in causal discovery. We introduce the framework of a causal system in an external context to derive a connection between strict conditional independencies and causal relations between variables. Constraint-based causal discovery is...

Conditional Dependence in Longitudinal Data Analysis

Mixed models are widely used to analyze longitudinal data. In their conventional formulation as linear mixed models (LMMs) and generalized LMMs (GLMMs), a commonly indispensable assumption in settings involving longitudinal non-Gaussian data is that the longitudinal observations from subjects are conditionally independent, given subject-specific random effects. Although conventional Gaussian...

Inferring gene regulatory networks from gene expression data by PC-algorithm based on conditional mutual information

Motivation: Reconstruction of gene regulatory networks (GRNs), which explicitly represent the causality of developmental or regulatory process, is of utmost interest and has become a challenging computational problem for understanding the complex regulatory mechanisms in cellular systems. However, all existing methods of inferring GRNs from gene expression profiles have their strengths and weak...

Evaluation of the Association of Htr2a Gene Rs6313 Polymorphism with Heroin Dependence in a Sample from Northwest Iran

Introduction: Heroin dependence is a chronic relapsing disorder caused by a combination of genetic, epigenetic, and environmental factors. The genetic contribution in the vulnerability to heroin dependence is 40%-60%. Alterations in dopamine transport in the CNS are implicated in drug and alcohol dependence, and according to linkage studies, the HTR2A rs6313 single nucleotide polymorphism plays...

Discovery and Visualization of Nonstationary Causal Models

There are several issues with causal discovery from fMRI. First, the sampling frequency is so low that the time-delayed dependence between different regions is very small, making time-delayed causal relations weak and unreliable. Moreover, the complex correspondence between neural activity and the BOLD signal makes it difficult to formulate a causal model to represent the effect as a function o...

Journal:
  • CoRR

Volume abs/1407.7566

Publication date 2014